ABSTRACT


The purpose of this work is to investigate the number
of arithmetic operations required by algorithms which
evaluate polynomials. Previous results show that a polynomial
of degree n requires at least n/2 multiplication/divisions
and at least n addition/subtractions for its evaluation if
the coefficients of the polynomial are suitably independent
irrational numbers. However, the coefficients of any polynomial
that would be evaluated in practice are represented only
to a finite accuracy and are therefore rational numbers.

The above results are extended to show that the same lower
bounds hold for almost all rational polynomials if the polynomial
is being evaluated efficiently. Another lower bound
result is given that shows that almost all rational polynomials
of degree n require at least $\sqrt{n}$ multiplication/divisions for
their evaluation by any algorithm, efficient or not.

Several algorithms are presented which can in theory
evaluate any rational polynomial using $O(\sqrt{n})$ multiplications
and many additions. While of no practical use for rational
polynomials in general, these algorithms do turn out to give
methods for evaluating a polynomial at a matrix argument
which are more efficient than previous methods.
I. Introduction.

One purpose of computational complexity is to find lower
bounds on the degree of difficulty involved in performing some
computation or class of computations under some measure of
difficulty. Another aim is then to show that the lower bound
is "tight" by exhibiting an algorithm which performs the
computation with a degree of difficulty close to the lower
bound.

This paper will consider the problem of evaluating a
polynomial in one variable with real coefficients,

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0, \qquad a_0, a_1, \ldots, a_n \in R.$$

Polynomials  arise  very  often  in  practice  and  many 
functions  are  evaluated  by  evaluating  a  finite  portion  of 
their  Taylor  polynomial. 

The measure of difficulty will be the number of individual
arithmetic operations $+, -, \times, \div$ used to evaluate the
polynomial. We will be interested in how the required number
of operations grows as a function of n, the degree of the
polynomial. The greatest emphasis will be on counting the
number of multiplications and divisions.

Algorithms will consist of a sequence of arithmetic
operations. At each step, each of the two arguments for the
operation may be some input variable, some fixed number,
or the result of some previous step. This is formalized
in the following definition.

Definition. Let S be an infinite field and V be a finite set
of input variables. A rational algorithm A over S is a
sequence $A = a(1), a(2), \ldots, a(k)$ where

1). $a(1) \in S \cup V$ and

2). for $2 \le r \le k$ either $a(r) \in S \cup V$

or $a(r) = (\mathrm{op}, i, j)$
where $1 \le i, j < r$ and $\mathrm{op} \in \{+, -, \times, \div\}$.

Let S(V) denote the extension field formed by attaching the
indeterminates in V to the field S and closing under the
rational operations.

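Define the associated elements $p_1, p_2, \ldots, p_k \in S(V) \cup \{\infty\}$ by

$$p_r = \begin{cases}
a(r) & \text{if } a(r) \in S \cup V \\
p_i \ \mathrm{op}\ p_j & \text{if } a(r) = (\mathrm{op}, i, j),\ p_i \neq \infty,\ p_j \neq \infty, \text{ and either } \mathrm{op} \neq \div \text{ or } p_j \neq 0 \\
\infty & \text{otherwise}
\end{cases}$$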

Now A computes the function $f \in S(V)$ if
$f = p_r$ for some r, $1 \le r \le k$.
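The definition can be illustrated with a small interpreter. This is only a sketch under stated assumptions: it evaluates the steps at one numeric point with exact rational arithmetic instead of building the formal elements of S(V), and the names `run`, `INF`, and `OPS` are hypothetical, not from the paper.

```python
from fractions import Fraction
import operator

INF = object()  # sentinel playing the role of the symbol "infinity"
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def run(algorithm, inputs):
    """Evaluate a rational algorithm at a point.

    Each step is a number (an element of S), a string (a variable in V),
    or a triple (op, i, j) whose 1-based indices i, j name earlier steps.
    Returns the list of values p_1, ..., p_k.
    """
    p = []
    for r, step in enumerate(algorithm, start=1):
        if isinstance(step, tuple):
            op, i, j = step
            assert 1 <= i < r and 1 <= j < r, "arguments must be earlier steps"
            a, b = p[i - 1], p[j - 1]
            # An undefined argument or a division by zero yields "infinity".
            if a is INF or b is INF or (op == "/" and b == 0):
                p.append(INF)
            else:
                p.append(OPS[op](a, b))
        elif isinstance(step, str):
            p.append(Fraction(inputs[step]))
        else:
            p.append(Fraction(step))
    return p

# x^2 + 1 as the sequence a(1) = x, a(2) = (*, 1, 1), a(3) = 1, a(4) = (+, 2, 3)
program = ["x", ("*", 1, 1), 1, ("+", 2, 3)]
values = run(program, {"x": 3})
```

Here the last entry of `values` is the value of the computed function at x = 3; the algorithm "computes" any function that appears among its steps.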

For example, to evaluate a polynomial p(x) with real
coefficients, one could choose S = R and V = {x}.
(Throughout the remainder of this paper, C will denote the
field of complex numbers, R will denote the field of real
numbers, Q the field of rational numbers, and Z the ring of
integers.)

Quite a bit of work has already been done concerning
the complexity of polynomial evaluation algorithms. The
new work presented here was initially motivated by considering
polynomials with rational coefficients. The only known lower
bound arguments show that at least n/2 multiplications are
required to evaluate a degree n polynomial but assume that the
coefficients $a_0, a_1, \ldots, a_n$ of the polynomial are
algebraically independent real numbers, that is, there is no
rational polynomial $P \in Q[y_0, y_1, \ldots, y_n]$, $P \not\equiv 0$,
such that $P(a_0, a_1, \ldots, a_n) = 0$.

Such an assumption is necessary to obtain interesting
lower bounds because some degree n polynomials, such as
$p(x) = x^n + x^{n-1} + x^{n-2} + \cdots + x + 1$, can be evaluated in
$O(\log n)$ operations.
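One way to see the logarithmic bound for this example, sketched under the assumption that division is allowed and x is not 1, is the closed form (x^(n+1) - 1)/(x - 1) with the power obtained by repeated squaring. The function names below are hypothetical.

```python
from fractions import Fraction

def power(x, e):
    """Return (x**e, multiplications used) by repeated squaring, for e >= 1."""
    result, mults = None, 0
    base = x
    while e > 0:
        if e & 1:
            if result is None:
                result = base          # first factor costs nothing
            else:
                result = result * base
                mults += 1
        e >>= 1
        if e:
            base = base * base         # next power x^(2^i)
            mults += 1
    return result, mults

def geometric_poly(x, n):
    """Evaluate x**n + ... + x + 1 for x != 1.

    Returns (value, multiplication/divisions used).
    """
    top, mults = power(x, n + 1)
    return (top - 1) / (x - 1), mults + 1   # one division at the end

value, ops = geometric_poly(Fraction(3), 10)
```

For n = 10 this uses six multiplication/divisions, compared with ten multiplications for Horner's rule on the same polynomial.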

If the coefficients of the polynomial are rational
numbers, then they are necessarily algebraically dependent
and the known lower bound methods do not directly apply.

In fact, any rational polynomial of degree n can in theory
be evaluated using only $O(\sqrt{n})$ multiplications, as will be
seen later. The coefficients of any polynomial that we
would want to evaluate in practice would be represented
only to a fixed number of decimal places and would therefore
be rational numbers.

Before presenting the results for rational polynomials,
the next section will give a brief survey of previous and
current results concerning polynomial evaluation.

II. Survey of Previous Work.

Previous algorithms and lower bounds fall into two main
classes, those which use preprocessing and those that do not.

Algorithms in the latter class, with no preprocessing, use the
coefficients $a_0, a_1, \ldots, a_n$ of the polynomial being evaluated
as the fixed numbers (scalars) which enter into the algorithm.

The best known algorithm of this type is the scheme known
as Horner's rule, which evaluates

$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ as

Algorithm A.

$$p(x) = (\cdots((a_n x + a_{n-1})x + a_{n-2})x + \cdots + a_1)x + a_0$$

in n multiplications and n additions.
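A minimal sketch of Algorithm A with explicit operation counts follows; the function name `horner` and the coefficient ordering (highest degree first) are choices made here for illustration.

```python
def horner(coeffs, x):
    """Evaluate a polynomial by Horner's rule.

    coeffs = [a_n, ..., a_1, a_0], highest degree first.
    Returns (p(x), multiplications, additions).
    """
    value = coeffs[0]
    mults = adds = 0
    for a in coeffs[1:]:
        value = value * x + a   # one multiplication and one addition per coefficient
        mults += 1
        adds += 1
    return value, mults, adds

# 2x^3 - 3x^2 + 5 at x = 2
result = horner([2, -3, 0, 5], 2)
```

The coefficients themselves are the only scalars used, which is what places Horner's rule in the no-preprocessing class.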

Preprocessing algorithms use as scalars certain real
numbers which are precomputed from the $a_i$. There are
preprocessing algorithms which use about n/2 multiplications
and n additions. These algorithms are most useful when the
polynomial is to be evaluated many times at different points.

The preprocessing need only be done once and n/2 multiplications
are then saved over Horner's rule each time the polynomial
is evaluated.

2.1 Preprocessing

A preprocessing algorithm is presented first.

Theorem 2.1 (Motzkin [1], Pan [2]). Any degree n real
polynomial can be evaluated using $\lfloor n/2 \rfloor + 2$ multiplications
and n additions.

Proof:

Algorithm B. $p(x) = a_n x^n + \cdots + a_1 x + a_0$ is evaluated by
the scheme

$$y = x + c, \qquad w = y^2$$

$$z = (a_n y + s_0)y + t_0 \quad (n \text{ even})$$

$$z = a_n y + t_0 \quad (n \text{ odd})$$

$$p(x) = (\cdots((z(w - s_1) + t_1)(w - s_2) + t_2)\cdots)(w - s_m) + t_m$$

where $m = \lceil n/2 \rceil - 1$
and $c, s_i, t_i$ $(i = 0, \ldots, m)$ are certain real numbers
which are found in the following manner.

First note that deg(p) = n implies that p can be
written as

$$p(x) = (x^2 - s_m)\,q(x) + t_m$$

where deg(q) = n - 2 and $s_m, t_m \in C$.

If $p(x) = a_n x^n + \cdots + a_1 x + a_0$, then $s_m$ is a root of the
auxiliary polynomial formed from p by taking each "odd"
coefficient,

$$p_{aux}(x) = a_{2m+1} x^m + \cdots + a_3 x + a_1 .$$

$t_m$ is found by a polynomial division. Now apply this to q(x),
finding an $s_{m-1}$ and $t_{m-1}$, and continue. It turns out that
$s_1, \ldots, s_m$ are the m roots of $p_{aux}(x)$ and the $t_i$ are
obtained from the $s_i$ and $a_i$ by polynomial divisions. Note
that the $s_i$ and $t_i$ may be complex even though the $a_i$ are real.
This is undesirable because complex arithmetic requires at
least twice as many real operations as real arithmetic, so
any advantage over a non-preprocessing algorithm is lost.
However, J. Eve [3] has noted that

if $\tilde{p}(y) = p(y - c) = p(x)$,

then all the roots of $\tilde{p}_{aux}(y)$ are real for an appropriate
real c. ∎

For a further discussion of Algorithm B, see Knuth [4].

Turning now to the problem of finding a lower bound,
it is possible to show that at least $\lceil n/2 \rceil$ multiplication/
divisions are required to evaluate a degree n polynomial if
the coefficients are algebraically independent real numbers.

A multiplication/division operation is either a multiplication
or a division. It is instructive to view this proof in some
detail as the methods will be extended and used later. First,
the concept of algebraic independence is defined.

Definition. $a_1, \ldots, a_n \in R$ are said to be algebraically
independent if there is no $P \in Q[y_1, \ldots, y_n]$, $P \not\equiv 0$,
such that $P(a_1, \ldots, a_n) = 0$.

The first lemma shows the maximum amount of computation
that can be done with a certain number of multiplication/
divisions.

Lemma 2.1. A polynomial p(x) can be evaluated in no more than
k multiplication/divisions by a rational algorithm over R if
and only if it can be evaluated by the scheme $S_k$ given by

$$u_1 = (m_{10} x + s_1)\ \mathrm{op}_1\ (\hat{m}_{10} x + \hat{s}_1)$$

$$u_2 = (m_{21} u_1 + m_{20} x + s_2)\ \mathrm{op}_2\ (\hat{m}_{21} u_1 + \hat{m}_{20} x + \hat{s}_2)$$

in general

$$u_r = \Big(\sum_{i=1}^{r-1} m_{ri} u_i + m_{r0} x + s_r\Big)\ \mathrm{op}_r\ \Big(\sum_{i=1}^{r-1} \hat{m}_{ri} u_i + \hat{m}_{r0} x + \hat{s}_r\Big), \qquad r = 2, \ldots, k$$

and finally

$$p(x) = \sum_{i=1}^{k} m_{k+1,i} u_i + m_{k+1,0} x + s_{k+1}$$

where

$m_{ij}, \hat{m}_{ij} \in Z$ for all i, j
$s_i, \hat{s}_i \in R$ for all i
$\mathrm{op}_i \in \{\times, \div\}$ for all i.

Proof: The proof is complete once it is noticed that after
r-1 multiplication/divisions have been performed, the only
computation that can be performed without doing another
multiplication/division is addition/subtractions on

$\{x, u_1, u_2, \ldots, u_{r-1}\} \cup R$

where $u_i$ is the result of the i-th multiplication/division.
Any such computation can be written as

$$\sum_{i=1}^{r-1} m_{ri} u_i + m_{r0} x + s_r$$

where $m_{ri} \in Z$, $i = 0, \ldots, r-1$, and $s_r \in R$.

The integer "multiplication" $m_{ri} u_i$ is just shorthand for
$m_{ri}$ repeated additions of $u_i$.

Two terms of this type can be formed and combined with
a multiplication or division to get the next multiplication/
division step $u_r$. ∎
The  next  lemma  is  also  necessary. 

Lemma 2.2. Suppose $p_i \in Q(x_1, \ldots, x_m)$, $i = 1, \ldots, n$.

That is, there are n rational functions, each in the m
variables $x_1, \ldots, x_m$. If n > m, then there exists a
multivariate rational polynomial $P \in Q[y_1, \ldots, y_n]$, $P \not\equiv 0$,
such that $P(p_1(x_1, \ldots, x_m), \ldots, p_n(x_1, \ldots, x_m)) = 0$
for all $x_1, \ldots, x_m$ such that $p_i \neq \infty$, $i = 1, \ldots, n$.
Informally, if there are more functions than variables, then
they satisfy a non-trivial polynomial relation.

A formal proof of this lemma will not be given. Very
informally, if $p_1, \ldots, p_n$ were algebraically independent,
then $Q(p_1, \ldots, p_n)$ would have degree of transcendence n
over Q. But since the $p_i$ are rational functions in the $x_i$,
$Q(p_1, \ldots, p_n) \subseteq Q(x_1, \ldots, x_m)$, which has degree of
transcendence m over Q. n > m gives the contradiction. For
further details see for example [5]. ∎


The  following  lower  bound  result  can  now  be  proven. 

Theorem 2.2 (Motzkin [1], Winograd [6]).

Any $p(x) = a_n x^n + \cdots + a_1 x + a_0$ with $a_0, a_1, \ldots, a_n$
algebraically independent real numbers requires at least
$\lceil n/2 \rceil$ multiplication/divisions for its evaluation by any
rational algorithm over R.

Proof: Assume p(x) can be evaluated in k multiplication/
divisions and therefore by the scheme $S_k$ for some choice of
$m_{ij}, \hat{m}_{ij} \in Z$, $s_i, \hat{s}_i \in R$, $\mathrm{op}_i \in \{\times, \div\}$.

Formally carry out the operations in $S_k$ and view the result
as a polynomial in x,

$$p(x) = p_n(\bar{s}) x^n + \cdots + p_1(\bar{s}) x + p_0(\bar{s}) = a_n x^n + \cdots + a_1 x + a_0 ,$$

where each $p_i$ is a rational function in

$\bar{s} = (s_1, \hat{s}_1, \ldots, s_k, \hat{s}_k, s_{k+1})$.

Note that there are 2k + 1 variables in $\bar{s}$.

Now if 2k + 1 < n + 1, then by Lemma 2.2 there is a
rational polynomial $P \not\equiv 0$ such that

$P(p_n(\bar{s}), \ldots, p_0(\bar{s})) = 0$

or $P(a_n, \ldots, a_1, a_0) = 0$, which is a contradiction.
Therefore $2k + 1 \ge n + 1$ or $k \ge n/2$. ∎

It is useful to think of the $s_i, \hat{s}_i$ as representing
degrees of freedom. The set of all degree n polynomials has
n + 1 degrees of freedom because each coefficient can be
varied independently. Lemma 2.1 states that each multiplication/
division can introduce at most two degrees of freedom into the
algorithm. The concept of degrees of freedom is formalized in
Knuth [4], but the method of proof of Theorem 2.2 using
degrees of freedom is essentially the one presented here.
Using similar techniques, it can be shown that


Theorem 2.3 (Belaga [7]). Any $p(x) = a_n x^n + \cdots + a_1 x + a_0$
with $a_0, a_1, \ldots, a_n$ algebraically independent real numbers
requires at least n addition/subtractions for its evaluation
by any rational algorithm over R.

A proof is also presented in Knuth [4]. It is done
by showing that each addition/subtraction can introduce at
most one degree of freedom except for a single addition/
subtraction which can introduce two. ∎


Therefore  Algorithm  B  is  an  almost  optimal  (±  two 
multiplications)  method  for  polynomial  evaluation  with 
preprocessing. 


2.2 Rational Preprocessing

Part of the preprocessing required by Algorithm B
involves finding all the roots of a polynomial of degree
n/2. This may be computationally difficult in itself and may
lead to inaccuracies in the actual evaluated values of the
polynomial. The latter is true because even if the coefficients
of the polynomial being evaluated are rational numbers and have
a finite decimal representation, the scalars used by the
algorithm may be irrational numbers which cannot be
represented as their exact value but must be rounded to a
finite number of decimal places. The following algorithm,
discovered by Michael Paterson, has the advantage that the
preprocessing involves only the rational arithmetic operations.
This advantage is obtained at the cost of doing a
few more multiplications during the actual evaluation of
the polynomial.

Theorem  2.4  (Paterson).  Any  degree  n  real  polynomial  can 
be  evaluated  using  n/2  +  O(log  n)  multiplications. 

Moreover,  the  scalars  used  by  the  algorithm  are  rational 
functions  of  the  coefficients  of  the  polynomial. 

Proof:

Algorithm C. For the moment assume p(x) is monic (that is,
$a_n = 1$) and deg(p) = n = $2^j - 1$ for some positive integer j.
First compute $x^2, x^4, x^8, \ldots, x^{2^{j-1}}$ in $\lfloor \log n \rfloor$
multiplications. (All logarithms are taken to base 2.)

Now note that if p(x) is monic and deg(p) = 2m - 1, then
p(x) can be written as

$$p(x) = (x^m + c)\,q(x) + r(x)$$

where q and r are monic, deg(q) = deg(r) = m - 1, and the
coefficients of q and r are given by rational functions of
the coefficients of p.

In fact

$$x^{2m-1} + a_{2m-2} x^{2m-2} + \cdots + a_1 x + a_0
= (x^m + c)(x^{m-1} + a_{2m-2} x^{m-2} + \cdots + a_{m+1} x + a_m)
+ (x^{m-1} + b_{m-2} x^{m-2} + \cdots + b_1 x + b_0)$$

where $c = a_{m-1} - 1$

and $b_i = a_i - c\,a_{m+i}$, $i = 0, \ldots, m-2$.


If deg(p) = 2m - 1 = $2^i - 1$,
then deg(q) = deg(r) = $2^{i-1} - 1$
and $x^m = x^{2^{i-1}}$ is one of the powers that were computed
at the beginning. q(x) and r(x) are of the proper form
(monic of degree $2^i - 1$ for some i) and the procedure
may be applied recursively to them.

Let M(n) = the number of multiplications required to
evaluate a monic degree n polynomial by this procedure,
assuming that $x^2, x^4, x^8, \ldots, x^{2^{j-1}}$ have been computed.

Then by the above argument,

$M(2^i - 1) = 2M(2^{i-1} - 1) + 1$, $i = 2, 3, \ldots, j$.

Also M(1) = 0 because $x + a_0$ can be evaluated using
no multiplications. This recurrence relation solves as

$M(2^i - 1) = 2^{i-1} - 1$, $i = 1, 2, 3, \ldots, j$.

So $M(n) = M(2^j - 1) = 2^{j-1} - 1 = (n + 1)/2 - 1$.

Allowing one more multiplication to undo the division that made
the polynomial monic, the count becomes (n + 1)/2.

Total multiplications = $(n + 1)/2 + \lfloor \log n \rfloor$ if n = $2^j - 1$.

For general n, the polynomial can be broken into pieces,
each of degree $2^i - 1$ for some i. The pieces can be
evaluated separately and put back together using the powers
$x^2, x^4, x^8, \ldots$. The putting-back-together can require at
most another log n multiplications for a total of about
n/2 + 2 log n. ∎
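The recursive splitting can be sketched as follows for the monic case n = 2^j - 1 only. This is an illustration under assumptions, not the paper's own presentation: the tree representation and the names `preprocess` and `evaluate` are hypothetical, and exact rational arithmetic is used to stress that the preprocessing needs only rational operations.

```python
from fractions import Fraction

def preprocess(a):
    """Split a monic polynomial of degree n = 2**j - 1 into a tree.

    a = [a_0, ..., a_n], lowest degree first, with a[n] == 1.
    A node stores m and c for p(x) = (x**m + c) q(x) + r(x).
    """
    n = len(a) - 1
    assert n >= 1 and a[n] == 1 and (n + 1) & n == 0
    if n == 1:
        return ("leaf", a[0])                     # x + a_0
    m = (n + 1) // 2
    c = a[m - 1] - 1
    q = a[m:]                                     # a_m, ..., a_{2m-2}, 1
    r = [a[i] - c * a[m + i] for i in range(m)]   # b_0, ..., b_{m-2}, 1
    return ("node", m, c, preprocess(q), preprocess(r))

def evaluate(tree, x):
    """Return (p(x), multiplications, additions) for a preprocessed tree."""
    count = {"mul": 0, "add": 0}
    powers = {1: x}
    top = tree[1] if tree[0] == "node" else 1
    m = 1
    while m < top:                                # x^2, x^4, ..., x^(2^(j-1))
        powers[2 * m] = powers[m] * powers[m]
        count["mul"] += 1
        m *= 2

    def walk(t):
        if t[0] == "leaf":
            count["add"] += 1
            return x + t[1]
        _, m, c, q, r = t
        count["mul"] += 1                         # (x^m + c) * q(x)
        count["add"] += 2
        return (powers[m] + c) * walk(q) + walk(r)

    return walk(tree), count["mul"], count["add"]

coeffs = [Fraction(v) for v in [5, -2, 3, 0, 7, 1, -4, 1]]   # monic, degree 7
tree = preprocess(coeffs)
value, mults, adds = evaluate(tree, Fraction(2))
```

For degree 7 the sketch uses two multiplications for the powers and three in the recursion, against seven for Horner's rule; the tree is built once and can be reused for every evaluation point.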

The number of additions used by Algorithm C can also
be counted.

Let A(n) = the number of additions required by Algorithm
C to evaluate a degree n polynomial.

A satisfies the recurrence relation

$A(2^i - 1) = 2A(2^{i-1} - 1) + 2$, $i = 2, 3, 4, \ldots$

A(1) = 1

which solves as

$A(2^i - 1) = 2^i + 2^{i-1} - 2 = (3/2) \cdot 2^i - 2$, $i = 1, 2, 3, \ldots$
or $A(n) \approx (3/2)n$.

Therefore, Algorithm C uses 2n + O(log n) arithmetic
operations, and is no more efficient than Horner's rule in
this respect. However, if multiplication is slower than
addition, Algorithm C is more efficient than non-preprocessing
schemes such as Horner's rule if the polynomial
is to be evaluated many times.

2.3 No Preprocessing

Since an algorithm without preprocessing must start
from the $a_i$ themselves as scalars, it must in effect evaluate
the general polynomial

$P(x, a_0, a_1, \ldots, a_n) = a_n x^n + \cdots + a_1 x + a_0$

where the $a_i$ as well as x are inputs. The following result
shows that Horner's rule is an optimal algorithm for P.

Theorem 2.5 (Pan [2], Winograd [6]). Any algorithm which
computes $P(x, a_0, a_1, \ldots, a_n) = a_n x^n + \cdots + a_1 x + a_0$
requires at least n multiplication/divisions.

No  proof  will  be  given. 

Borodin [8] has shown that Horner's rule is the only
algorithm which evaluates $P(x, a_0, a_1, \ldots, a_n)$ in n
multiplications and n additions. Therefore, Horner's rule
is a unique optimal method for polynomial evaluation with
no preprocessing.

2.4  Parallel  Algorithms 

In previous sections, algorithms were assumed to be
sequential. Work has been done concerning algorithms which
can do many operations at each step and some of the results
will be considered next without proofs.

Parallel algorithms can be formalized as follows. At
step r, the algorithm may compute m terms of the form

$u_{ri} = s_i\ \mathrm{op}_i\ t_i$, $i = 1, \ldots, m$

where $\mathrm{op}_i \in \{+, -, \times, \div\}$ and each of $s_i$ and $t_i$ is some
input variable, some fixed number, or one of the results of
some previous step j, $1 \le j < r$. If only k processors are
available, then $m \le k$ at each step. All arithmetic operations
are given equal weight. The most interesting results concern
the problem of evaluating the general polynomial
$P(x, a_0, a_1, \ldots, a_n) = a_n x^n + \cdots + a_1 x + a_0$
with no preprocessing.

Estrin [9] has given a parallel algorithm which has
the merit of simplicity of description. The algorithm computes
p(x) of degree n as $p(x) = q(x) \cdot x^{n/2+1} + r(x)$,
where deg(q) = deg(r) = n/2, and then q and r similarly
by a binary splitting, and so on. Thus it starts by computing

$x^2,\ a_1 x + a_0,\ a_3 x + a_2,\ \ldots,\ a_n x + a_{n-1}$

in the first two steps, then

$x^4,\ (a_3 x + a_2)x^2 + (a_1 x + a_0),\ \ldots$ in the next two, etc.
If an unlimited number of processors are available, this
scheme requires about 2 log n steps.
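The binary splitting can be sketched sequentially, counting rounds in which all pairings are independent and could run on separate processors. The function name `estrin` and the round counter are illustrative assumptions; each round corresponds to one parallel multiplication step followed by one parallel addition step.

```python
def estrin(coeffs, x):
    """Evaluate a polynomial by Estrin's pairing scheme.

    coeffs = [a_0, a_1, ..., a_n], lowest degree first.
    Returns (p(x), rounds); each round is one multiplication step
    plus one addition step when enough processors are available.
    """
    terms = list(coeffs)
    power = x
    rounds = 0
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0)            # pad so that every term has a partner
        # The pairings below do not depend on one another.
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power = power * power          # x^2, x^4, ... for the next round
        rounds += 1
    return terms[0], rounds

# 2x^3 - 3x^2 + 5 at x = 2
value, rounds = estrin([5, 0, -3, 2], 2)
```

With n = 3 the sketch needs two rounds, that is four parallel steps, in line with the estimate of about 2 log n.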

Dorn [10] has modified this to obtain an algorithm
which uses only k < n processors and runs in
2n/k + 2 log k steps.

Recent  work  by  Munro  and  Paterson  has  improved  these 
algorithms  to  reduce  the  coefficient  of  the  log  term  to  1. 

They  also  give  lower  bound  results. 

Theorem 2.6 (Munro, Paterson [11]). Any parallel algorithm
which computes the general polynomial of degree n requires
at least $\lceil \log n \rceil + 1$ steps. If only k processors are
available, then $\lceil 2n/k \rceil + \lceil \log k \rceil - 1$ steps are required.

They  give  algorithms  which  run  in  time  closer  to  these 
lower  bounds  than  previous  results. 

Theorem 2.7 (Munro, Paterson [11]). The general polynomial
of degree n can be evaluated in $\log n + O(\sqrt{\log n})$ steps
by a parallel algorithm using n processors. If only k
processors are available, O(log n) < k < n, then the
evaluation can be done in 2n/k + log k + ) steps.

III. Main Results

This section will be concerned with finding lower bounds
on algorithms which evaluate rational polynomials and exhibiting
algorithms which can in theory be used to evaluate any
rational polynomial using a number of multiplications close
to the lower bound. Since any set of rational numbers is
necessarily algebraically dependent, the lower bound results
of section II are not directly applicable. In fact, one might
suspect that rational polynomials of degree n could be evaluated
in fewer than n/2 multiplications because an integer
multiplication can be done free of multiplication by doing
many additions, $z \cdot u = u + u + \cdots + u$ (z times). Any rational
polynomial $r(x) \in Q[x]$ is of the form

$r(x) = r_0 \cdot z(x)$ where $r_0 \in Q$, $z(x) \in Z[x]$.

Because of this fact, it is useful to single out those
multiplications of the form $c \cdot u$ where c is a fixed number
(that is, c has no dependence on x).
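The remark that an integer multiple costs no multiplications can be made concrete. The sketch below swaps the z-fold sum in the text for a doubling scheme, an assumption made here only to keep the addition count near 2 log z; the name `times_integer` is hypothetical.

```python
from fractions import Fraction

def times_integer(z, u):
    """Return (z*u, additions used) for an integer z >= 1, using additions only."""
    result, adds = None, 0
    term = u
    while z > 0:
        if z & 1:
            if result is None:
                result = term          # first summand costs nothing
            else:
                result = result + term
                adds += 1
        z >>= 1
        if z:
            term = term + term         # u, 2u, 4u, ...
            adds += 1
    return result, adds

product, adds = times_integer(13, Fraction(5, 7))
```

Either way the operation count contains additions only, which is why multiplications by fixed numbers are treated separately in what follows.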

Definition. Referring to the definition of a rational
algorithm over a scalar field S, the step a(r) defines a
scalar multiplication if

$a(r) = (\times, i, j)$ and either $a(i) \in S$ or $a(j) \in S$.
a(r) defines a scalar division if

$a(r) = (\div, i, j)$ and $a(j) \in S$.

Any multiplication or division which is not scalar will
be called non-scalar. Non-scalar multiplications are of
the form $q(x) \cdot r(x)$ and cannot be eliminated by successive
additions.
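The distinction can be sketched as a counter over the steps of a rational algorithm. The step encoding (numbers for scalars, strings for variables, triples with 1-based indices for operations) and the name `count_nonscalar` are assumptions made for this illustration.

```python
def count_nonscalar(algorithm):
    """Count the multiplication/division steps that are not scalar.

    A step is a number (element of S), a string (variable in V), or a
    triple (op, i, j) whose 1-based indices i, j name earlier steps.
    """
    def is_scalar(index):
        step = algorithm[index - 1]
        return not isinstance(step, (tuple, str))   # the step itself lies in S

    nonscalar = 0
    for step in algorithm:
        if isinstance(step, tuple):
            op, i, j = step
            if op == "*" and not (is_scalar(i) or is_scalar(j)):
                nonscalar += 1
            elif op == "/" and not is_scalar(j):
                nonscalar += 1
    return nonscalar

# 3x^2 + x: x*x is non-scalar, 3*(x*x) is a scalar multiplication
program = ["x", ("*", 1, 1), 3, ("*", 3, 2), ("+", 4, 1)]
n_nonscalar = count_nonscalar(program)
```

Note that dividing a scalar by a computed quantity counts as non-scalar under the definition, since only the divisor is required to lie in S.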


3.1  Lower  Bounds 


The first lemma shows the maximum amount of computation
that can be done with a certain number of non-scalar
multiplication/divisions.


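Lemma 3.1. A polynomial p(x) can be evaluated in no more
than k non-scalar multiplication/divisions by a rational
algorithm over S if and only if it can be evaluated by the
scheme $A_k$ given by

$$u_1 = (m_{10} x + m_{1,-1})\ \mathrm{op}_1\ (\hat{m}_{10} x + \hat{m}_{1,-1})$$

$$u_2 = (m_{21} u_1 + m_{20} x + m_{2,-1})\ \mathrm{op}_2\ (\hat{m}_{21} u_1 + \hat{m}_{20} x + \hat{m}_{2,-1})$$

in general

$$u_r = \Big(\sum_{i=1}^{r-1} m_{ri} u_i + m_{r0} x + m_{r,-1}\Big)\ \mathrm{op}_r\ \Big(\sum_{i=1}^{r-1} \hat{m}_{ri} u_i + \hat{m}_{r0} x + \hat{m}_{r,-1}\Big), \qquad r = 2, 3, \ldots, k$$

and finally

$$p(x) = \sum_{i=1}^{k} m_{k+1,i} u_i + m_{k+1,0} x + m_{k+1,-1}$$

where

$m_{ij}, \hat{m}_{ij} \in S$ for all i, j
$\mathrm{op}_i \in \{\times, \div\}$ for all i.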

Proof: After r-1 non-scalar multiplication/divisions have
been performed, the only computation that can be performed
without doing another non-scalar multiplication/division is
addition/subtractions and scalar multiplication/divisions on

$\{x, u_1, u_2, \ldots, u_{r-1}\} \cup S$.

Any such computation can be written as

$$\sum_{i=1}^{r-1} m_{ri} u_i + m_{r0} x + m_{r,-1}$$

where $m_{ri} \in S$, $i = -1, \ldots, r-1$.

Two terms of this type can be formed and combined with
a multiplication or division to get the next non-scalar
multiplication/division step $u_r$. ∎

The $m_{ij}$ will be called the parameters of the algorithmic
scheme $A_k$.

For what follows, it will be useful to count the number
of parameters in $A_k$. The expression for $u_r$ introduces 2r + 2
parameters and the expression for p(x) introduces k + 2, for
a total of $\sum_{r=1}^{k} (2r + 2) + k + 2 = k^2 + 4k + 2$.

The first theorem digresses for a moment to consider
polynomials with algebraically independent coefficients in
the context of counting only non-scalar multiplication/
divisions.

Theorem 3.1 (M. Paterson). Any $p(x) = a_n x^n + \cdots + a_1 x + a_0$
with $a_0, a_1, \ldots, a_n$ algebraically independent real numbers
requires at least $\lceil \sqrt{n + 3} \rceil - 2$ non-scalar multiplication/
divisions for its evaluation by any rational algorithm over R.

Proof: Assume p(x) can be evaluated in k non-scalar
multiplication/divisions and therefore by the scheme $A_k$ for some
choice of $m_{ij}, \hat{m}_{ij} \in R$, $\mathrm{op}_i \in \{\times, \div\}$.

Formally carry out the operations in $A_k$ and view the
result as a polynomial in x,

$$p(x) = p_n(\bar{m}) x^n + \cdots + p_1(\bar{m}) x + p_0(\bar{m}) = a_n x^n + \cdots + a_1 x + a_0 ,$$

where each $p_i$ is a rational function in $\bar{m}$, the vector of
parameters of length $k^2 + 4k + 2$.

If $k^2 + 4k + 2 < n + 1$, then by Lemma 2.2, there is
a rational polynomial $P \not\equiv 0$ such that
$P(p_0(\bar{m}), \ldots, p_n(\bar{m})) = 0$
or $P(a_0, \ldots, a_n) = 0$. Contradiction.

Therefore $k^2 + 4k + 2 \ge n + 1$ or $k \ge \sqrt{n + 3} - 2$. ∎
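The step from the parameter count to the bound can be checked numerically. The sketch below (function names are hypothetical) finds the smallest k with k^2 + 4k + 2 >= n + 1 and compares it with the closed form in the theorem.

```python
import math

def parameter_count(k):
    """Number of parameters in the scheme A_k: sum of (2r + 2) plus k + 2."""
    return sum(2 * r + 2 for r in range(1, k + 1)) + k + 2

def min_nonscalar(n):
    """Smallest k for which A_k has at least n + 1 parameters."""
    k = 0
    while parameter_count(k) < n + 1:
        k += 1
    return k

def closed_form(n):
    """ceil(sqrt(n + 3)) - 2, computed exactly with integer square roots."""
    return math.isqrt(n + 2) + 1 - 2

bounds = [(n, min_nonscalar(n)) for n in (1, 10, 100, 1000)]
```

The agreement follows from completing the square: k^2 + 4k + 2 >= n + 1 is the same as (k + 2)^2 >= n + 3.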


Lower bound results for rational polynomials will now
be presented. It is difficult to obtain interesting lower
bounds for all rational polynomials because some degree n
rational polynomials can be evaluated fast, in O(log n)
operations. However, the lower bounds will be shown to
apply to "almost all" rational polynomials. A set of rational
vectors will satisfy the intuitive notion of being almost all
rational vectors if all vectors not in the set satisfy a
non-trivial rational polynomial relation.

Definition. A set $S \subseteq \{q(x) \in Q[x] \mid \deg(q) \le n\}$ will be
said to contain almost all rational polynomials of degree n
if there is a $P \in Q[y_n, \ldots, y_0]$, $P \not\equiv 0$, such that

$q_n x^n + \cdots + q_1 x + q_0 \in Q[x] - S$ implies $P(q_n, \ldots, q_0) = 0$.

In particular, if S contains almost all rational
polynomials of degree n, then $S \neq \emptyset$.

If $S = \emptyset$ then $q_n x^n + \cdots + q_1 x + q_0 \in Q[x] - S$
for all $(q_n, \ldots, q_0) \in Q^{n+1}$, which implies that $P(Q^{n+1}) = 0$.
But since the rationals are dense in the reals and since P is
continuous, this implies $P(R^{n+1}) = 0$ or $P \equiv 0$, contrary
to assumption.

A similar argument shows that if S contains almost all
rational polynomials of degree n, then

$\{(q_n, \ldots, q_0) \in Q^{n+1} \mid q_n x^n + \cdots + q_1 x + q_0 \in S\}$

is a dense subset of $R^{n+1}$.

The  first  result  assumes  that  the  algorithm  contains 
no  divisions. 


Theorem 3.2. For any n > 0, there are rational polynomials
of degree n which require $\lceil \sqrt{n + 3} \rceil - 2$ multiplications for
their evaluation by any rational algorithm over R without
divisions. In fact, almost all rational polynomials of
degree n require $\lceil \sqrt{n + 3} \rceil - 2$ multiplications.


Proof: Suppose $q(x) = q_n x^n + \cdots + q_1 x + q_0 \in Q[x]$ can
be evaluated in k non-scalar multiplications and no divisions
and therefore by the scheme $A_k$ for $\mathrm{op}_i = \times$, $i = 1, \ldots, k$,
and for some choice of $m_{ij}, \hat{m}_{ij} \in R$.

Formally carry out the operations in $A_k$ and obtain

$$q(x) = p_n(\bar{m}) x^n + \cdots + p_1(\bar{m}) x + p_0(\bar{m}) = q_n x^n + \cdots + q_1 x + q_0$$

where the $p_i$ are rational polynomials in the parameters $\bar{m}$.
Assume $k^2 + 4k + 2 < n + 1$ and find a rational polynomial
$P \not\equiv 0$ such that $P(p_n(\bar{m}), \ldots, p_0(\bar{m})) = 0$

or $P(q_n, \ldots, q_0) = 0$.

Note that the $p_i(\bar{m})$ and P depend only on the form $A_k$,
not on the particular polynomial being evaluated.

If all rational polynomials can be evaluated in no more
than k non-scalar multiplications, then

$P(q_n, \ldots, q_0) = 0$ for all $(q_n, \ldots, q_0) \in Q^{n+1}$.
But since the rationals are dense in the reals and since P is
a continuous function, this implies

$P(r_n, \ldots, r_0) = 0$ for all $(r_n, \ldots, r_0) \in R^{n+1}$
or $P \equiv 0$, contrary to assumption.
Therefore, there must be some rational polynomials which
require at least k non-scalar multiplications and therefore
at least k multiplications where

$k^2 + 4k + 2 \ge n + 1$ or $k \ge \sqrt{n + 3} - 2$.

The second conclusion of the theorem (which actually
implies the first) follows from the fact that

$P(q_n, \ldots, q_0) \neq 0$ implies that $q_n x^n + \cdots + q_1 x + q_0$
requires at least $\sqrt{n + 3} - 2$ multiplications, where
$P \not\equiv 0$ is as above. ∎

This lower bound can be improved slightly by noticing
that the parameters in $A_k$ must contain some redundancy.

Lemma 3.2. If p(x) can be computed by the scheme $A_k$ with
no divisions, then p(x) can be computed by $A_k$ with

i). $m_{r,-1} = \hat{m}_{r,-1} = 0$, $r = 1, 2, \ldots, k$

ii). For all r, $1 \le r \le k$, there is an $i \ge 0$ and $j \ge 0$
such that $m_{ri} = \hat{m}_{rj} = 1$

iii). $m_{20} = 0$.


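Proof: i). Sequentially for r = 1, 2, ..., k write

$$u_r = \Big(\sum_{i=1}^{r-1} m_{ri} u_i + m_{r0} x + m_{r,-1}\Big) \times \Big(\sum_{i=1}^{r-1} \hat{m}_{ri} u_i + \hat{m}_{r0} x + \hat{m}_{r,-1}\Big)$$

as

$$u_r = \Big(\sum_{i=1}^{r-1} m_{ri} u_i + m_{r0} x\Big) \times \Big(\sum_{i=1}^{r-1} \hat{m}_{ri} u_i + \hat{m}_{r0} x\Big) + \sum_{i=1}^{r-1} c_{ri} u_i + c_{r0} x + c_{r,-1}$$

where the $c_{ri}$ are scalars obtained from the $m_{ri}$, $\hat{m}_{ri}$.
Compute instead

$$u_r' = \Big(\sum_{i=1}^{r-1} m_{ri} u_i + m_{r0} x\Big) \times \Big(\sum_{i=1}^{r-1} \hat{m}_{ri} u_i + \hat{m}_{r0} x\Big)$$

and adjust the coefficients of $1, x, u_1, u_2, \ldots, u_{r-1}$
in all following steps to compensate for the lost terms

$$\sum_{i=1}^{r-1} c_{ri} u_i + c_{r0} x + c_{r,-1} .$$

ii). For r = 1, 2, ..., k, in step $u_r$, if
$m_{ri} \neq 0$ and $\hat{m}_{rj} \neq 0$,
compute instead $u_r' = (1 / (m_{ri} \hat{m}_{rj})) \cdot u_r$
and adjust the coefficients of $u_r$ in all following steps.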


Hi).  After  the  above  two  reductions  have  been  made, 


u2  =  (x2  +  ra20x)*(x2  +  m2Qx)  or  ug  =  (x2  +  m20x)*(x). 
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In  the  first  case,  compute  Instead 
u£  =  (x2)*(x2  +  (m20  +  n>20)x) 

and  adjust  the  coefficient ' of  u^  in  all  following  iteps  to 

^  2 

compensate  for  the  lost  term  m20ffi20x  * 

In  the  second  case,  compute  instead  u^  =  x^*  x  .  ____/ 
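As a check on reduction iii), the first case can be expanded directly. The short derivation below is an added worked step, not part of the original proof; it writes $a = m_{20}$ and $b = \hat{m}_{20}$ for brevity and uses $u_1 = x^2$, which holds once reductions i) and ii) have been applied.

```latex
\begin{align*}
(x^2 + a x)(x^2 + b x) &= x^4 + (a + b)\,x^3 + ab\,x^2 \\
x^2\,\bigl(x^2 + (a + b)\,x\bigr) &= x^4 + (a + b)\,x^3
\end{align*}
```

The two products differ by $ab\,x^2 = ab\,u_1$, a scalar multiple of $u_1$, so the difference can be restored by adjusting the coefficient of $u_1$ wherever $u_2$ is used in a later step.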

Corollary 3.2  The lower bound of Theorem 3.2 can be
raised to $\lceil \sqrt{n} \rceil$ multiplications required.

Proof:  After the reductions of Lemma 3.2 have been made,
there are $k^2 + 1$ non-constant parameters in $A_k$, and
$k^2 + 1 \ge n + 1$ gives $k \ge \sqrt{n}$. ∎
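The three parameter counts used in this section ($k^2 + 4k + 2$ for the unreduced scheme, $k^2 + 2k + 2$ after reduction i) or ii) alone, and $k^2 + 1$ after all three reductions) each give a lower bound as the smallest $k$ whose count reaches $n + 1$. The sketch below is only an illustration of that arithmetic; the function names are hypothetical and not taken from the text.

```python
import math

def min_steps(n, param_count):
    """Smallest k >= 0 with param_count(k) >= n + 1 (degrees-of-freedom bound)."""
    k = 0
    while param_count(k) < n + 1:
        k += 1
    return k

def full(k):
    # parameters of the unreduced scheme A_k
    return k * k + 4 * k + 2

def one_reduction(k):
    # parameters after reduction i) alone, or ii) alone
    return k * k + 2 * k + 2

def reduced(k):
    # non-constant parameters after all three reductions of Lemma 3.2
    return k * k + 1

def closed_forms(n):
    """ceil(sqrt(n+3)) - 2, ceil(sqrt(n)) - 1, ceil(sqrt(n)), for n >= 1."""
    ceil_sqrt_n = math.isqrt(n - 1) + 1
    return math.isqrt(n + 2) - 1, ceil_sqrt_n - 1, ceil_sqrt_n

for n in (10, 100, 1000):
    print(n, min_steps(n, full), min_steps(n, one_reduction), min_steps(n, reduced))
```

The brute-force search and the closed forms agree, which is a quick way to confirm that the inequalities were solved correctly.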

A similar result can be proven using combinatorial
techniques under the assumption that an integer polynomial
is being evaluated by an algorithm over $Z$.  (Since there are
no divisions, $S$ need only be a ring.)  The proof is due to
Michael Paterson.


Theorem 3.3  (Paterson).  For any $n > 0$, there are integer
polynomials of degree $n$ which require at least $\sqrt{n} - 1$
multiplications for their evaluation by any algorithm over
$Z$ without divisions.


Proof:  Consider the finite ring $F = \{0, 1\}$ and the ring
homomorphism $H : Z \to F$ defined by
\[ H(z) = \begin{cases} 1 & \text{if $z$ is odd} \\ 0 & \text{if $z$ is even.} \end{cases} \]

First it is shown that if $z_n x^n + \cdots + z_1 x + z_0 \in Z[x]$
can be computed in $k$ non-scalar multiplications by an algorithm
over $Z$, then $w_n x^n + \cdots + w_1 x + w_0 \in F[x]$,
where $w_i = H(z_i)$, $i = 0, 1, \ldots, n$,
can be computed in $k$ non-scalar multiplications by an algorithm
over $F$.

Formally carry out the operations in $A_k$ (with no
divisions) and obtain
\[ p(x) = p_n(m)x^n + \cdots + p_1(m)x + p_0(m) \]
where each $p_i$ is a rational polynomial in $m$ (in fact one with
integer coefficients, since the scheme involves no constants
other than the parameters).

Now $z_n x^n + \cdots + z_1 x + z_0$ computed by $A_k$ over $Z$ implies
there is an $m \in Z^{k'}$, where $k'$ = number of parameters in $A_k$,
such that
\[ z_i = p_i(m), \quad i = 0, 1, \ldots, n, \]
which implies
\[ w_i = H(z_i) = p_i^*(H(m)), \quad i = 0, 1, \ldots, n, \]
where $p_i^*$ means "do the arithmetic mod 2" and $H(m)$ is the
vector in $F^{k'}$ obtained by applying $H$ to each element of $m$.

This implies that $w_n x^n + \cdots + w_1 x + w_0$ is computed by
$A_k$ over $F$.

Note that reduction i) of Lemma 3.2 can be done if
$S = Z$, so that $k' = k^2 + 2k + 2$.

Assume $k^2 + 2k + 2 < n + 1$.

The proof will be complete if we can present polynomials
of degree $n$ in $F[x]$ which cannot be computed by $A_k$ over $F$.
There are such polynomials because there are $2^{n+1}$ polynomials
of degree at most $n$ in $F[x]$, but only $2^{k^2+2k+2}$ different
polynomials can be computed by $A_k$ over $F$.

Therefore
\[ k^2 + 2k + 2 \ge n + 1 \quad\text{or}\quad k \ge \sqrt{n} - 1. \] ∎
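The key step in the proof of Theorem 3.3 is that reducing the parameters mod 2 and then running the scheme gives the same polynomial as running the scheme over $Z$ and reducing the result mod 2. The sketch below illustrates this on a division-free scheme of the $A_k$ shape with reduction i) applied (no constant terms inside the products). The data layout and all function names are assumptions made for this example only.

```python
import random

def poly_mul(a, b, mod=None):
    """Product of two coefficient lists (low-order coefficient first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return [c % mod for c in out] if mod else out

def poly_lin(coeffs, polys, mod=None):
    """Linear combination sum(coeffs[i] * polys[i]) of coefficient lists."""
    out = [0] * max(len(p) for p in polys)
    for c, p in zip(coeffs, polys):
        for i, pi in enumerate(p):
            out[i] += c * pi
    return [c % mod for c in out] if mod else out

def run_scheme(params, k, mod=None):
    """params[r-1] = (m_r, mhat_r), each over u_0 = x, u_1, ..., u_{r-1};
    params[k] is the final combination over 1, x, u_1, ..., u_k."""
    basis = [[0, 1]]                                  # u_0 = x
    for r in range(1, k + 1):
        m, mhat = params[r - 1]
        left = poly_lin(m, basis, mod)
        right = poly_lin(mhat, basis, mod)
        basis.append(poly_mul(left, right, mod))      # one non-scalar multiplication
    return poly_lin(params[k], [[1]] + basis, mod)

def random_params(k, lo=-9, hi=9):
    steps = [([random.randint(lo, hi) for _ in range(r)],
              [random.randint(lo, hi) for _ in range(r)]) for r in range(1, k + 1)]
    return steps + [[random.randint(lo, hi) for _ in range(k + 2)]]

def reduce_mod2(params):
    """Apply H (parity) to every parameter."""
    steps = [([c % 2 for c in m], [c % 2 for c in mh]) for m, mh in params[:-1]]
    return steps + [[c % 2 for c in params[-1]]]

params = random_params(3)
over_z = run_scheme(params, 3)
over_f = run_scheme(reduce_mod2(params), 3, mod=2)
print([c % 2 for c in over_z] == over_f)
```

The layout uses $2r$ parameters at step $r$ and $k + 2$ in the final combination, which totals $k^2 + 2k + 2$, the count $k'$ used in the proof.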
Now divisions are allowed in the algorithm and the same
lower bound is shown to hold.


Theorem 3.4  For any $n > 0$, almost all rational polynomials
of degree $n$ require $\lceil \sqrt{n} \rceil - 1$ multiplication/divisions for
their evaluation by any rational algorithm over $R$.


Proof:  Note that reduction ii) of Lemma 3.2 can be done if
the algorithm contains divisions, so that $A_k$ contains
$k^2 + 2k + 2$ parameters.  Assume $k^2 + 2k + 2 < n + 1$.

Consider each of the $2^k$ algorithmic forms obtained
from $A_k$ by independently choosing a $*$ or $\div$ at each non-scalar
step.  Using techniques of Theorem 3.2, for each of these
forms find a $P_i \in Q[y_n, \ldots, y_0]$, $P_i \not\equiv 0$, such that if
$q_n x^n + \cdots + q_1 x + q_0$ is computed by the $i$-th form, then
\[ P_i(q_n, \ldots, q_0) = 0. \]

Define
\[ P(y_n, \ldots, y_0) = \prod_{i=1}^{2^k} P_i(y_n, \ldots, y_0). \]
Then $P \in Q[y_n, \ldots, y_0]$ and $P \not\equiv 0$.

Also, if $q_n x^n + \cdots + q_1 x + q_0$ can be computed in $k$
non-scalar multiplication/divisions, then it can be computed by
the $i_0$-th form, for some $i_0$, so that
\[ P_{i_0}(q_n, \ldots, q_0) = 0 \quad\text{and}\quad P(q_n, \ldots, q_0) = 0. \]

Therefore $P(q_n, \ldots, q_0) \ne 0$ implies that
$q_n x^n + \cdots + q_1 x + q_0$ requires at least $k$ (non-scalar)
multiplication/divisions where $k^2 + 2k + 2 \ge n + 1$.

The conclusion follows. ∎




Lower bound results will be returned to later in the
context of bounded additions.

Several algorithms will be presented next which are
optimal to within a constant multiple of the number of
non-scalar multiplications required.


3.2  Algorithms


This section will present several algorithms which can
evaluate any degree $n$ polynomial using $O(\sqrt{n})$ non-scalar
multiplications and no divisions.  The algorithms are
applicable to any real polynomial, including those with
algebraically independent coefficients.

Two of these algorithms can in theory be used to evaluate
any rational polynomial in $O(\sqrt{n})$ total multiplications,
because if $q(x) \in Q[x]$ can be evaluated by an algorithm
over $Q$, then $z_0 \cdot q(x) \in Z[x]$ can be evaluated by an algorithm
over $Z$ using the same number of non-scalar multiplications,
for some $z_0 \in Z$.  This is true because, for $r = 1, 2, \ldots, k$,
if
\[ u_r = \Bigl(\sum_{i=1}^{r-1} m_{ri}u_i + m_{r0}x\Bigr) * \Bigl(\sum_{i=1}^{r-1} \hat{m}_{ri}u_i + \hat{m}_{r0}x\Bigr) \]
with $m_{ri}, \hat{m}_{ri} \in Q$,
compute instead $u_r' = z_r \cdot u_r$ where $z_r$ is the least common
multiple of the denominators of the $m_{ri}$, $\hat{m}_{rj}$, and adjust
the coefficients of $u_r$ in all following steps.

If the algorithm uses no preprocessing or rational
preprocessing, then rational coefficients produce rational
scalars in the algorithm and the above procedure may be used
to yield an algorithm with integer scalars.  Therefore, scalar
multiplications can be done by many additions and only
non-scalar multiplications need be counted.  The point is not
that one would ever want to do this in practice, but that it
would be impossible to obtain a lower bound stronger than
$O(\sqrt{n})$ multiplications for rational polynomials if
rational polynomials can indeed be evaluated in $O(\sqrt{n})$
multiplications using this trick.  A following section will bound
the additions and show that a stronger lower bound holds.

A more practical use for these algorithms concerns the
problem of evaluating a real or rational polynomial at a
matrix argument,
\[ p(A) = a_n A^n + \cdots + a_1 A + a_0 I \]
with $a_0, \ldots, a_n \in R$ and $A$ a real matrix.

For example, one might want to evaluate such a polynomial
when evaluating $e^{At}$ by a Taylor polynomial, where $A$ is a
matrix, in order to find the solution of a system of linear
differential equations.  In this case, a non-scalar
(matrix)·(matrix) multiplication is slower than a scalar
(number)·(matrix) multiplication (using state-of-the-art
matrix multiplication procedures) and it would be useful
to minimize the number of non-scalar multiplications.
3.2.1  Description of the Algorithms

The first algorithm is an "extended Horner's rule".

Theorem 3.5  (M. J. Fischer, A. R. Meyer, M. S. Paterson).
Any degree $n$ polynomial can be evaluated using $2\lceil \sqrt{n} \rceil - 2$
non-scalar multiplications and no divisions.  No preprocessing
of coefficients is required.

Proof:

Algorithm D.  First compute $x^2, x^3, x^4, \ldots, x^k$
for some $k$, using $k - 1$ multiplications.

Let $y = x^k$ and write
\[ p(x) = a_n x^n + \cdots + a_1 x + a_0 \]
as
\[ p(x) = p_m(x)y^m + \cdots + p_1(x)y + p_0(x) \]
where $\deg(p_i) \le k - 1$, $i = 0, 1, \ldots, m - 1$, $\deg(p_m) \le k$,
and $m = \lceil n/k \rceil - 1$.  (Allowing the leading block $p_m$ to
reach degree $k$ costs nothing, since $x^k$ is already available,
and is what lets $m = \lceil n/k \rceil - 1$ suffice when $k$ divides $n$.)

In fact
\[ p_0(x) = a_{k-1}x^{k-1} + \cdots + a_1 x + a_0 \]
\[ p_1(x) = a_{2k-1}x^{k-1} + \cdots + a_{k+1}x + a_k \]
and so on.

Each of these $p_i(x)$ can be computed using only scalar
multiplications on $x, x^2, x^3, \ldots, x^k$.

Therefore, $p(x)$ can be evaluated in $m$ additional non-scalar
multiplications as
\[ p(x) = ((\cdots(p_m(x)y + p_{m-1}(x))y + p_{m-2}(x))y + \cdots + p_1(x))y + p_0(x). \]

Minimizing $k + n/k - 2$ with respect to $k$ gives $k = \sqrt{n}$,
or $2\sqrt{n} - 2$ total non-scalar multiplications required. ∎

Note that Algorithm D uses $n$ additions and about
$n - \sqrt{n}$ scalar multiplications.  The $-\sqrt{n}$ appears because
$\sqrt{n}$ scalars enter through additions.
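A minimal sketch of Algorithm D follows, assuming the argument is an ordinary number so that results are easy to check; for a matrix argument the same structure applies with matrix products in place of the counted multiplications. The function name and the counting convention (only products of two $x$-dependent quantities are counted) are choices made for this illustration. It takes $k = \lceil \sqrt{n} \rceil$ and lets the leading block use $x^k$.

```python
import math

def algorithm_d(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i); return (value, non-scalar multiplication count)."""
    n = len(coeffs) - 1
    if n == 0:
        return coeffs[0], 0
    k = math.isqrt(n - 1) + 1            # k = ceil(sqrt(n))
    count = 0
    powers = [1, x]                      # powers[t] = x**t
    for _ in range(k - 1):               # x^2, ..., x^k: k - 1 non-scalar products
        powers.append(powers[-1] * x)
        count += 1
    y = powers[k]
    m = -(-n // k) - 1                   # m = ceil(n/k) - 1

    def block(i):
        # p_i(x): scalar multiplications and additions only.
        # The leading block (i == m) takes every remaining coefficient,
        # so its degree may reach k.
        hi = n if i == m else i * k + k - 1
        return sum(coeffs[t] * powers[t - i * k] for t in range(i * k, hi + 1))

    value = block(m)
    for i in range(m - 1, -1, -1):       # Horner's rule in y: m non-scalar products
        value = value * y + block(i)
        count += 1
    return value, count

value, count = algorithm_d(list(range(1, 11)), 3)   # a degree 9 polynomial at x = 3
print(value, count)
```

For a perfect square $n = k^2$ the count is exactly $(k - 1) + (k - 1) = 2\sqrt{n} - 2$, and for other $n$ it does not exceed $2\lceil \sqrt{n} \rceil - 2$.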

Algorithm D actually gives a method for producing
$O(\sqrt{n})$ non-scalar multiplication algorithms from $O(n)$ algorithms
in which all multiplications are counted.  The idea is to
compute $x^2, x^3, x^4, \ldots, x^k$ for some $k$, let $y = x^k$,
$m = n/k$, and write $p(x)$ of degree $n$ as
\[ p(x) = p_m(x)y^m + \cdots + p_1(x)y + p_0(x) \]
where $\deg(p_i) \le k - 1$, $i = 0, 1, \ldots, m$.

This polynomial can be evaluated in $O(m) = O(n/k)$ additional
non-scalar multiplications using one of the $O(n)$ algorithms
in which all multiplications are counted.  Using this method
on Horner's rule yields Algorithm D.  Complications arise
if the $O(n)$ algorithm uses preprocessing because the
preprocessing must be done before the "coefficients" $p_i(x)$
are evaluated.  This method does give insights toward
producing better $O(\sqrt{n})$ algorithms.  The first of these was
obtained from Algorithm C.

Theorem 3.6  Any polynomial of degree $n$ can be evaluated
using $\sqrt{2n} + O(\log n)$ non-scalar multiplications and no
divisions.  Moreover, the scalars used by the algorithm
are rational functions of the coefficients of the polynomial.

Proof:

Algorithm E.  Assume $n = k(2^j - 1)$ for some integers $k$, $j$.
1).  Compute $x^2, x^3, x^4, \ldots, x^k$  (at most $k$ multiplications).
2).  Compute $x^{2k}, x^{4k}, x^{8k}, \ldots, x^{2^{j-1}k}$  ($\log_2 (n/k)$ multiplications).

Let $p(x)$ be monic of degree $k(2m - 1)$ expressed in
the form
\[ p(x) = q(x) \cdot x^{km} + r(x) \]
where $q$ is monic, $\deg(q) = k(m-1)$, $\deg(r) \le km - 1$.
Formally divide $r(x) - x^{k(m-1)}$ by $q(x)$,
obtaining
\[ r(x) - x^{k(m-1)} = c(x) \cdot q(x) + s(x) \]
where $\deg(c) \le k - 1$, $\deg(s) \le k(m-1) - 1$.

Therefore
\[ p(x) = (x^{km} + c(x))\,q(x) + (x^{k(m-1)} + s(x)) \]
or, renaming $x^{k(m-1)} + s(x)$ as $s(x)$,
\[ p(x) = (x^{km} + c(x))\,q(x) + s(x) \]
where $\deg(c) \le k - 1$, $\deg(q) = \deg(s) = k(m-1)$,
$q$ and $s$ are monic, and the coefficients of $c$, $q$, $s$ are
rational functions of the coefficients of $p$.

If $m = 2^{i-1}$ for some $i$, $2 \le i \le j$,
then $p$ is of the form monic of degree $k(2^i - 1)$,
and $q$ and $s$ are of the same form, monic of degree $k(2^{i-1} - 1)$.
Also $x^{km}$ is one of the powers computed in 2).
Also $\deg(c) \le k - 1$ implies that $c$ can be computed free
of non-scalar multiplications.

Let $M(n)$ = the number of non-scalar multiplications
required to evaluate a degree $n$ monic polynomial assuming
that the powers in 1) and 2) above have been computed.
Then by the above argument
\[ M(k(2^i - 1)) = 2M(k(2^{i-1} - 1)) + 1, \quad i = 2, 3, \ldots, j. \]
Also $M(k) = 0$ because it only involves scalar
multiplications and additions.

This recurrence relation solves as
\[ M(k(2^i - 1)) = 2^{i-1} - 1, \quad i = 1, 2, 3, \ldots, j, \]
or $M(n) = 2^{j-1} - 1 \approx n/2k$.

The total number of non-scalar multiplications is
$n/2k + k + \log (n/k)$.

Minimizing with respect to $k$ gives $k \approx \sqrt{n/2}$,
or $\sqrt{2n} + \log \sqrt{2n}$ non-scalar multiplications required.

As in Algorithm C, for general $n$ this may require an
extra $\log \sqrt{2n}$ multiplications. ∎

Algorithm E uses about $n - \sqrt{2n}$ scalar multiplications.
The number of additions can be counted as
\[ A(k(2^i - 1)) = 2A(k(2^{i-1} - 1)) + k + 1, \qquad A(k) = k, \]
which solves as
\[ A(k(2^i - 1)) = k(2^i - 1) + 2^{i-1} - 1, \quad i = 1, 2, \ldots, j, \]
or $A(n) \approx n + n/2k \approx n + \sqrt{n/2}$.
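The recursive splitting in Algorithm E can be sketched as follows for a monic polynomial of degree exactly $k(2^j - 1)$ with integer coefficients and a numeric argument. This is an illustration under those assumptions, not a tuned implementation: the names are hypothetical, and the preprocessing (the polynomial divisions) is interleaved with the evaluation for brevity, whereas in practice it would be done once in advance. Only products of two $x$-dependent quantities are counted.

```python
def divide_by_monic(num, den):
    """Quotient and remainder of num / den for monic den (low-order first)."""
    num = list(num)
    d = len(den) - 1
    quot = [0] * (len(num) - d)
    for i in range(len(num) - 1, d - 1, -1):
        c = num[i]
        quot[i - d] = c
        for t, dt in enumerate(den):
            num[i - d + t] -= c * dt
    return quot, num[:d]

def algorithm_e(p, k, j, x):
    """Evaluate monic p of degree k*(2**j - 1) at x.
    Returns (value, number of non-scalar multiplications)."""
    assert len(p) - 1 == k * (2**j - 1) and p[-1] == 1
    count = 0
    small = [1, x]                            # small[t] = x**t for t = 0..k
    for _ in range(k - 1):
        small.append(small[-1] * x)
        count += 1
    big = {1: small[k]}                       # big[m] = x**(k*m), m a power of two
    for t in range(1, j):
        big[2**t] = big[2**(t - 1)] * big[2**(t - 1)]
        count += 1

    def low(c):
        # degree <= k: scalar multiplications and additions only
        return sum(ci * small[t] for t, ci in enumerate(c))

    def ev(poly, i):
        nonlocal count
        if i == 1:                            # monic of degree k, M(k) = 0
            return low(poly)
        m = 2 ** (i - 1)
        q, r = poly[k * m:], list(poly[:k * m])
        r[k * (m - 1)] -= 1                   # r(x) - x^{k(m-1)}
        c, s = divide_by_monic(r, q)
        s = s + [1]                           # s(x) + x^{k(m-1)}, monic again
        count += 1                            # the product (x^{km} + c(x)) * q(x)
        return (big[m] + low(c)) * ev(q, i - 1) + ev(s, i - 1)

    return ev(list(p), j), count

p = [3, -1, 4, 1, -5, 9, 1]                   # monic, degree 6 = 2 * (2**2 - 1)
print(algorithm_e(p, 2, 2, 3))
```

The recursion performs $2^{j-1} - 1$ counted products, matching $M(n)$, on top of the $k - 1$ and $j - 1$ products used to build the two tables of powers.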

The next result is obtained from Algorithm B.

Theorem 3.7  Any polynomial of degree $n$ can be evaluated
using $\sqrt{2n} + 2$ non-scalar multiplications and no divisions.

Proof:

Algorithm F.  Refer to Algorithm B, which evaluates $p(x)$ of
degree $n$ as
\[ y = x + c, \qquad w = y^2, \]
\[ z = (a_n y + s_0)y + t_0 \quad (n \text{ even}), \qquad z = a_n y + t_0 \quad (n \text{ odd}), \]
\[ p(x) = ((\cdots(z(w - s_1) + t_1)(w - s_2) + t_2)\cdots)(w - s_m) + t_m, \]
\[ m = \lceil n/2 \rceil - 1, \]
where $c, s_i, t_i \in R$, $i = 0, 1, \ldots, m$.

Also compute $w^2, w^3, w^4, \ldots, w^k$ for some $k$.

Note that if $z(x)$ is any polynomial, then
\[ ((\cdots(z(x)(w - s_1) + t_1)(w - s_2) + t_2)\cdots)(w - s_k) + t_k \]
can be written as
\[ z(x)q(w) + r(w) \]
where $\deg(q) = k$ and $\deg(r) \le k - 1$.

In fact
\[ q(w) = (w - s_1)(w - s_2)\cdots(w - s_k) \]
and
\[ r(w) = ((\cdots(t_1(w - s_2) + t_2)(w - s_3) + t_3)\cdots)(w - s_k) + t_k. \]

Applying this $m/k \approx n/2k$ times to the form for $p(x)$ above,
\[ p(x) = ((\cdots(z \cdot q_1(w) + r_1(w))q_2(w) + r_2(w))\cdots)q_{m/k}(w) + r_{m/k}(w), \]
where $\deg(q_i) = k$, $\deg(r_i) \le k - 1$, $i = 1, 2, \ldots, m/k$.

Each of the $q_i$ and $r_i$ can be computed using only scalar
multiplications and additions on $w, w^2, w^3, \ldots, w^k$.
Therefore, at most $k + n/2k + 2$ non-scalar multiplications
are required.
Minimizing with respect to $k$ gives $k \approx \sqrt{n/2}$,
or $\sqrt{2n} + 2$ non-scalar multiplications required. ∎

Note that Algorithm F uses about $n$ additions and about
$n - \sqrt{2n}$ scalar multiplications.

Algorithm F uses general algebraic (root-finding)
preprocessing just as Algorithm B does.
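The identity that Algorithm F relies on, namely that $k$ nested steps of Algorithm B collapse to $z \cdot q(w) + r(w)$, can be checked numerically. The sketch below only verifies that algebraic identity for given numbers; it does not perform the root-finding preprocessing of Algorithm B, and the function names are hypothetical.

```python
def nested(z, w, s, t):
    """((...(z*(w - s[0]) + t[0])*(w - s[1]) + t[1])...)*(w - s[-1]) + t[-1]"""
    value = z
    for si, ti in zip(s, t):
        value = value * (w - si) + ti
    return value

def q_of(w, s):
    """q(w) = (w - s[0]) (w - s[1]) ... (w - s[-1])"""
    prod = 1
    for si in s:
        prod *= (w - si)
    return prod

def r_of(w, s, t):
    """r(w): the same nesting, started from t[0] and skipping the factor (w - s[0])."""
    value = t[0]
    for si, ti in zip(s[1:], t[1:]):
        value = value * (w - si) + ti
    return value

z, w = 7, 5
s = [1, -2, 3, 4]
t = [2, 0, -1, 6]
print(nested(z, w, s, t) == z * q_of(w, s) + r_of(w, s, t))
```

In Algorithm F itself, $q$ and $r$ would be expanded once into coefficients of $w, w^2, \ldots, w^k$, so that evaluating them needs only scalar multiplications; the products inside `q_of` and `r_of` here are just a convenient way to test the identity.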

3.2.2  Comparison of the Algorithms

It is interesting to compare the efficiency of some of
these algorithms when evaluating a polynomial of degree $n$ at
an $m \times m$ matrix argument.

Let   Cost of non-scalar multiplication = $m^3$
      Cost of scalar multiplication = $m^2$
      Cost of addition = $m^2$
Then  Horner's rule has cost $nm^3 + nm^2$
      Algorithm D has cost about $2\sqrt{n}\,m^3 + 2nm^2$

Algorithm D is therefore more efficient than Horner's rule if
\[ n > 4(m/(m-1))^2. \]
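The threshold follows from comparing the two cost expressions directly. A small sketch of that comparison is given below; it uses exactly the cost model stated above ($m^3$ per matrix product, $m^2$ per scalar multiplication or addition), and the function names are illustrative only.

```python
import math

def horner_cost(n, m):
    """n matrix products and n matrix additions."""
    return n * m**3 + n * m**2

def algorithm_d_cost(n, m):
    """About 2*sqrt(n) matrix products plus about 2n scalar operations on matrices."""
    return 2 * math.sqrt(n) * m**3 + 2 * n * m**2

def d_beats_horner(n, m):
    return algorithm_d_cost(n, m) < horner_cost(n, m)

def threshold(m):
    """Algorithm D wins when n exceeds this value (m >= 2)."""
    return 4 * (m / (m - 1)) ** 2

for m in (2, 3, 10, 100):
    print(m, threshold(m))
```

As $m$ grows the threshold approaches 4, so for large matrices Algorithm D is cheaper under this model for every degree above 4.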

Algorithm E may also be compared.  Preprocessing
algorithms are usually efficient only if the polynomial is to
be evaluated enough times to recover the time lost doing
preprocessing.  However, in this case, since the preprocessing
involves only operations on numbers (not matrices),
preprocessing methods would be more efficient even for a single
evaluation of a polynomial with matrix argument if the
polynomial is small enough and the matrix large enough.

Since the preprocessing done by Algorithm E is rational,
the number of operations involved can be counted.

Let $P(n)$ = the number of preprocessing operations
required for a degree $n$ polynomial.

The preprocessing for degree $k(2^i - 1)$ is reduced
to the preprocessing for two degree $k(2^{i-1} - 1)$ polynomials
plus the division of a degree $k \cdot 2^{i-1} - 1$ polynomial by a
degree $k \cdot 2^{i-1} - k$ polynomial.  This division can be done in
about $2k(k(2^{i-1} - 1)) < k^2 2^i$ operations.

Therefore
\[ P(k(2^i - 1)) \le 2P(k(2^{i-1} - 1)) + 2^i k^2 \quad\text{and}\quad P(k) = 0, \]
which solves as
\[ P(k(2^i - 1)) \le k^2 (i - 1)\, 2^i, \]
or, with $k \approx \sqrt{n/2}$,
\[ P(n) \le (n/2)(\log \sqrt{2n})(\sqrt{2n}) < (n^{3/2} \log n)/\sqrt{2}. \]
Algorithm E is more efficient than Algorithm D if
\[ \sqrt{2n}\,m^3 + 2nm^2 + (n^{3/2} \log n)/\sqrt{2} < 2\sqrt{n}\,m^3 + 2nm^2 \]
or
\[ n \log n < (2\sqrt{2} - 2)\,m^3 \approx 0.83\,m^3. \]
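The closed form for the preprocessing bound can be checked by unrolling the recurrence with equality. The sketch below does this for the recurrence $P_i = 2P_{i-1} + 2^i k^2$, $P_1 = 0$, where $P_i$ abbreviates the bound on $P(k(2^i - 1))$; the function names are hypothetical.

```python
def preprocessing_bound(k, i):
    """Unroll P_i = 2 * P_{i-1} + 2**i * k**2 starting from P_1 = 0."""
    bound = 0
    for level in range(2, i + 1):
        bound = 2 * bound + 2**level * k**2
    return bound

def closed_form(k, i):
    """k^2 (i - 1) 2^i"""
    return k**2 * (i - 1) * 2**i

for i in range(1, 6):
    print(i, preprocessing_bound(3, i), closed_form(3, i))
```

Dividing the recurrence by $2^i$ shows why: $P_i / 2^i$ grows by exactly $k^2$ at each level, giving $(i - 1)k^2$ after $i - 1$ levels.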

3.2.3  Conditioning 

The reader should be warned that the fact that an
algorithm looks efficient on paper does not imply that it
is well suited for practical use in all cases.  Actual
machines are not perfect computing devices but represent
numbers only to some fixed decimal accuracy.  One numerical
problem that arises concerning polynomial evaluation algorithms
is the problem of the conditioning of an algorithmic form.

A polynomial evaluation method can be viewed as a
transformation which maps a set of parameters having at least $n + 1$
degrees of freedom into the set of all real polynomials of
degree $n$.  An algorithm which evaluates a specific polynomial
is obtained by substituting the proper numbers for the
parameters.  For example, in Algorithm B, $c$, $s_i$, $t_i$ are
the parameters.

Informally, an algorithmic form is said to be
ill-conditioned if "small" errors in the parameters produce
"large" errors in the polynomial being evaluated.
Ill-conditioning is a property of the transformation defined by
the algorithmic form and is independent of round-off effects
and other problems which arise during the actual execution
of the algorithm.  Informally, an algorithm which evaluates
a specific polynomial is ill-conditioned if the error in
computed values of the polynomial is larger than one would
expect from normal round-off effects.

Experiments done by Rice [12] indicate that Algorithm B
is more likely to be ill-conditioned than Horner's rule.

To my knowledge, no analysis, either experimental or
theoretical, has been done concerning the conditioning of the
rational preprocessing Algorithm C.

Since the $O(\sqrt{n})$ algorithms D, E, F have basically the
same form as the $O(n)$ algorithms from which they are derived,
it could be expected that they would have similar conditioning
as algorithms A, C, B, respectively.

For  a  further  discussion  of  conditioning,  see  for 
example  [12],  [13]. 

3.3  Possible  Improvements 

The previous two sections have shown that any polynomial
of degree $n$ with algebraically independent coefficients
requires at least $\sqrt{n} - 1$ but not more than $\sqrt{2n} + 2$
non-scalar multiplication/divisions for its evaluation.
Presumably it should be possible to improve one or both of
these bounds to reach some optimal point in between.

One way in which the lower bound might be raised is
as follows.  Notice that the algorithmic scheme $A_k$ (page 23)
actually generates a result of degree $2^k$ if it contains no
divisions:
\[ p(x) = p_{2^k}(m)x^{2^k} + \cdots + p_1(m)x + p_0(m), \qquad p_i \in Q[m], \quad i = 0, 1, 2, \ldots, 2^k. \]

If $A_k$ is computing a polynomial of degree $n$,
\[ p(x) = a_n x^n + \cdots + a_1 x + a_0, \]
then not only
\[ p_i(m) = a_i, \quad i = 0, 1, 2, \ldots, n, \]
but also
\[ p_i(m) = 0, \quad i = n+1, n+2, \ldots, 2^k. \]

The equations $p_i(m) = 0$, $i = n+1, n+2, \ldots, 2^k$, place
constraints on the parameters $m$ so that they cannot have
their full $k^2 + 4k + 2$ degrees of freedom but something
less.  If $A_k$ contains divisions, then it is possible to
keep the degree of the result small, but again some degrees
of freedom in the parameters must be lost to assure that
$A_k$ is computing a polynomial and not a general rational
function.

Algorithm F ($\sqrt{2n}$ non-scalar multiplications) is
probably not optimal because the first $k = \sqrt{n/2}$
multiplications $w^2, w^3, \ldots, w^k$ are wasted in the sense that they
introduce no parameters into the algorithm.  There probably
are schemes which use fewer than $\sqrt{2n}$ non-scalar
multiplication/divisions but require complicated preprocessing.
Given $a_0, a_1, \ldots, a_n$, the preprocessing would require
the solution of
\[ p_i(m) = a_i, \quad i = 0, 1, 2, \ldots, n, \]
for the parameters $m$, where the $p_i$ are complicated rational
functions in $m$.

Thus $\sqrt{n}$ appears to be the best lower bound obtainable
by a simple degrees of freedom argument alone.  $\sqrt{2n}$ is
probably close to the best algorithm with a simple description
and reasonable preprocessing.

Another gap presently exists concerning the number of
multiplications required to evaluate a polynomial with
algebraically independent coefficients and with rational
preprocessing.  The best known algorithm (Algorithm C) uses
$n/2 + O(\log n)$ multiplications.  Any improvement of the
$n/2$ lower bound (Theorem 2.2) assuming rational preprocessing
will require an argument which goes beyond simple degrees of
freedom arguments.  Of course, there may be a better algorithm
also.


3.4  Lower  Bounds  on  Efficient  Evaluation 

As shown in section 3.2, any rational polynomial can
be evaluated in $O(\sqrt{n})$ multiplications.  This economy of
multiplications is gained at the expense of doing a very
large number of additions.  Since addition is never actually
free, this method must be inefficient in general.  This
section will show that almost all rational polynomials of
degree $n$ require $n/2$ multiplication/divisions for their
efficient evaluation if addition has positive cost.

Let $c_a \in R$, $c_a \ge 0$, denote the cost of addition and
$c_m \in R$, $c_m \ge 0$, denote the cost of multiplication.

Define the cost of a rational algorithm to be
\[ n_a \cdot c_a + n_m \cdot c_m \]
where $n_a$ = number of addition/subtractions used,
      $n_m$ = number of multiplication/divisions used.

Theorem 3.8  If $c_a > 0$, then almost all rational polynomials
of degree $n$ require at least $\lceil n/2 \rceil$ multiplication/divisions
for their evaluation by any least cost rational algorithm
over $R$.

Proof:  Assume $q(x) = q_n x^n + \cdots + q_1 x + q_0 \in Q[x]$
can be evaluated in no more than $k$ multiplication/divisions
and therefore by the scheme $S_k$ (see page 12).
Assume $k < \lceil n/2 \rceil$.

$q(x)$ can be evaluated by Horner's rule with cost
$n(c_a + c_m)$.  Let $B = \lfloor n(c_a + c_m)/c_a \rfloor$.
Any algorithm which evaluates $q(x)$ using more than $B$
addition/subtractions cannot be a least cost algorithm.

Consider each of the algorithmic forms obtained from
$S_k$ by independently choosing a $*$ or $\div$ at each step $u_r$,
and choosing some substitution of integers for the $m_i$
such that the algorithmic form uses $\le B$ addition/subtractions.
Remember that $m_i u_i$ is shorthand for $m_i$ additions.
There are only a finite number $N$ of such forms.

The $i$-th form computes a result of the form
\[ p(x) = p_{in}(s)x^n + \cdots + p_{i1}(s)x + p_{i0}(s) = q_n x^n + \cdots + q_1 x + q_0 \]
where $s$ is of length $< n + 1$.

Therefore, for $i = 1, 2, \ldots, N$, there is a
$P_i \in Q[y_n, \ldots, y_0]$, $P_i \not\equiv 0$, such that
if $q_n x^n + \cdots + q_1 x + q_0$ is computed by the $i$-th form
then $P_i(q_n, \ldots, q_0) = 0$.

Define
\[ P(y_n, \ldots, y_0) = \prod_{i=1}^{N} P_i(y_n, \ldots, y_0). \]

Then $q_n x^n + \cdots + q_1 x + q_0$ computed by a least cost
algorithm using less than $\lceil n/2 \rceil$ multiplication/divisions
implies that $P(q_n, \ldots, q_0) = 0$. ∎

As the ratio $c_a/c_m$ grows smaller, there are more
rational polynomials that can be evaluated using less than
$n/2$ multiplication/divisions, but as long as $c_a/c_m > 0$,
almost all require $n/2$.
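The role of the bound $B$ in the proof can be made concrete with a few lines of arithmetic. The sketch below computes the Horner cost $n(c_a + c_m)$ and the addition budget $B = \lfloor n(c_a + c_m)/c_a \rfloor$; exact rational arithmetic is used so that the floor is not disturbed by rounding. The function names are illustrative, not from the text.

```python
import math
from fractions import Fraction

def horner_cost(n, c_a, c_m):
    """Cost of Horner's rule: n additions and n multiplications."""
    return n * (Fraction(c_a) + Fraction(c_m))

def addition_budget(n, c_a, c_m):
    """B = floor(n (c_a + c_m) / c_a); requires c_a > 0."""
    return math.floor(horner_cost(n, c_a, c_m) / Fraction(c_a))

def cannot_be_least_cost(additions, n, c_a, c_m):
    """True if the additions alone already cost more than Horner's rule."""
    return additions * Fraction(c_a) > horner_cost(n, c_a, c_m)

n, c_a, c_m = 10, 1, 4
B = addition_budget(n, c_a, c_m)
print(B, cannot_be_least_cost(B, n, c_a, c_m), cannot_be_least_cost(B + 1, n, c_a, c_m))
```

As $c_a$ shrinks relative to $c_m$ the budget $B$ grows, so more addition-heavy forms stay in the finite list considered by the proof, which is the sense in which more polynomials escape the bound as $c_a/c_m$ decreases.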

A  similar  argument  shows  that  almost  all  rational 
polynomials  of  degree  n  require  n  additions  for  their 
evaluation  by  an  efficient  algorithm. 

Theorem 3.9  If $c_a > 0$ and $c_m > 0$, then almost all
rational polynomials of degree $n$ require at least $\lceil n/2 \rceil$
multiplication/divisions and at least $n$ addition/subtractions
for their evaluation by any least cost rational
algorithm over $R$.

Proof:  We use the fact that for every rational algorithmic
form using some number of multiplication/divisions and less
than $n$ addition/subtractions, there is a
$P_i \in Q[y_n, \ldots, y_0]$, $P_i \not\equiv 0$, such that
if $q_n x^n + \cdots + q_1 x + q_0$ is computed by that form,
then $P_i(q_n, \ldots, q_0) = 0$.

This follows from the fact that if the form contains less
than $n$ addition/subtractions, then the coefficients of the
result of the form can be parameterized in less than $n + 1$
parameters.  (See Theorem 2.3 and Knuth [4].)

The number of multiplication/divisions in a least
cost algorithm can be bounded by $B = \lfloor n(c_a + c_m)/c_m \rfloor$.

There are only a finite number of algorithmic forms
using fewer than $n$ addition/subtractions and not more than
$B$ multiplication/divisions.  As in Theorem 3.8, construct
a $\hat{P} \in Q[y_n, \ldots, y_0]$, $\hat{P} \not\equiv 0$, such that
if $q_n x^n + \cdots + q_1 x + q_0$ is computed by a least cost
algorithm using less than $n$ addition/subtractions,
then $\hat{P}(q_n, \ldots, q_0) = 0$.

Let $\tilde{P}(y_n, \ldots, y_0)$ be as in Theorem 3.8.
Define $P(y_n, \ldots, y_0) = \hat{P}(y_n, \ldots, y_0) \cdot \tilde{P}(y_n, \ldots, y_0)$.
The conclusion follows. ∎


Theorem 3.9 states that any computational savings that
take advantage of the algebraic dependence of rational
numbers can never work for all rational polynomials.

The last two lower bound results allow preprocessing.
It is reasonable to ask what kind of lower bound for rational
polynomial evaluation can be proven assuming no preprocessing.
The question is almost answered by Theorem 2.5, which states
that any algorithm which computes the general polynomial
\[ P(x, a_0, a_1, \ldots, a_n) = a_n x^n + \cdots + a_1 x + a_0 \]
requires at least $n$ multiplication/divisions.

Notice, however, that there are no-preprocessing
algorithms that use less than $n$ multiplication/divisions
and work for some infinite set of real polynomials.
For example,
\[ \hat{P}(x, a_0, a_1, \ldots, a_n) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \]
can be evaluated in $n - 1$ multiplications, and the algorithm
works for all monic polynomials but gives the wrong answer
for any non-monic polynomial.  Therefore, a no-preprocessing
algorithm which used less than $n$ multiplication/divisions,
gave wrong answers in general for real polynomials, but
happened to give the right answers for all rational polynomials
would not directly contradict Theorem 2.5.  However, this
cannot happen because the rationals are dense in the reals.


Theorem 3.10  Any rational algorithm over $R$ which evaluates
$P(x, a_0, a_1, \ldots, a_n) = a_n x^n + \cdots + a_1 x + a_0$ correctly
for all $a_0, a_1, \ldots, a_n \in Q$ requires at least $n$
multiplication/divisions.

Proof:  Formally carrying out the operations in the
no-preprocessing algorithm, the algorithm is seen to compute
a result of the form
\[ p(x) = p_n(a)x^n + \cdots + p_1(a)x + p_0(a) \]
where $a = (a_0, \ldots, a_n)$
and $p_i \in R(a_0, \ldots, a_n)$, $i = 0, 1, \ldots, n$,
that is, the $p_i$ are rational functions in the $a_i$ with real
coefficients.

Let $t \in R^{n+1}$ be any real vector such that
$p_i(t) \ne \infty$, $i = 0, 1, \ldots, n$.
Pick a region about $t$ within which all the $p_i$ are continuous.
Within this region find a sequence of rational vectors
$q_i$, $i = 1, 2, 3, \ldots$, such that $\lim q_i = t$.

Since the algorithm works for all rational polynomials,
we have
\[ (p_0(q_i), \ldots, p_n(q_i)) = q_i, \quad i = 1, 2, 3, \ldots \]

Therefore
\[ (p_0(t), \ldots, p_n(t)) = (p_0(\lim q_i), \ldots, p_n(\lim q_i)) = \lim\, (p_0(q_i), \ldots, p_n(q_i)) = \lim q_i = t. \]

Therefore the algorithm works for all real polynomials
for which it does not diverge.  (It can diverge only on a set
of measure zero.)  Therefore, by Theorem 2.5, the algorithm
requires $n$ multiplication/divisions. ∎

Algorithm D was stated to be a no-preprocessing algorithm
which could be used to evaluate any rational polynomial
in $2\sqrt{n}$ multiplications.  If this algorithm is used for
rational polynomial evaluation, there actually is implicit
preprocessing because $z \cdot u$, $z \in Z$, must be translated
into $u + u + \cdots + u$ ($z$ times).  There is no way for
a rational algorithm to do this.  However, an algorithm
with comparison and branching instructions could.
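The translation of $z \cdot u$ into repeated addition can be written down directly once comparisons and branches are allowed. The sketch below is one way to do it, using a hypothetical function name; the point is that the number of additions executed depends on the value of $z$, which is exactly what a straight-line rational algorithm cannot express.

```python
def times_integer(z, u):
    """Return z * u for an integer z using only additions, subtractions,
    comparisons and branches (no multiplication)."""
    total = u - u                 # a zero of the same kind as u
    if z < 0:                     # branch on the sign of z
        z = -z
        u = total - u
    while z > 0:                  # the loop count depends on the value of z
        total = total + u
        z = z - 1
    return total

print(times_integer(5, 2.5), times_integer(-3, 4), times_integer(0, 7))
```

The loop performs $|z|$ additions, which is why the addition count of the resulting polynomial evaluation is unbounded in terms of $n$ alone.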

It is not hard to construct an algorithm using the
perfect real arithmetic operations and comparison and
branching instructions which would evaluate
\[ P(x, a_0, a_1, \ldots, a_n) = a_n x^n + \cdots + a_1 x + a_0 \]
correctly for all $a_0, a_1, \ldots, a_n \in Q$, $x \in R$,
using [...] multiplications and no divisions (and many
additions and comparisons).  It would diverge if any of the
[...] $n + 2$ input registers, each of which is capable of
holding a perfect real number.

Therefore, the assumption of rational algorithm is
necessary for a result such as Theorem 3.10.
IV.  Summary.

The main purpose of this work has been to exhibit bounds
on the number of arithmetic operations required to evaluate
rational polynomials.

Since integer multiplications can be done by successive
additions, the analysis of lower bounds was facilitated by
only counting multiplication/divisions that involve $x$, the
argument variable of the polynomial being evaluated, on both
sides.  These are called non-scalar multiplication/divisions.
The first lower bound result (Theorem 3.4) shows that almost
all rational polynomials of degree $n$ require $\sqrt{n}$ (non-scalar)
multiplication/divisions for their evaluation.  Here "almost all
rational polynomials" means all those outside some set of measure
zero.

Several algorithms were presented that can evaluate
any polynomial of degree $n$ using $O(\sqrt{n})$ non-scalar
multiplications, $O(n)$ scalar multiplications, and $O(n)$ additions.

These algorithms can in theory be used to evaluate any
rational polynomial of degree $n$ using $O(\sqrt{n})$ total
multiplications by doing integer scalar multiplications by many
additions.  These algorithms may have practical use in cases
where non-scalar multiplications are more expensive than
scalar multiplications.  This would be true if the polynomial
is being evaluated at a matrix argument or at some other
mathematical object for which it is harder to multiply two
objects than it is to multiply an object by a number.

The trick of simulating an integer multiplication by
many additions must give inefficient algorithms in general.
This is formalized by a lower bound result (Theorem 3.9)
which shows that almost all rational polynomials of degree $n$
require $n/2$ multiplication/divisions and $n$ addition/subtractions
for their efficient evaluation.  These are the same
(achievable) lower bounds that apply to any polynomial with
algebraically independent coefficients.

The techniques described herein could easily be applied
to extend other degrees of freedom arguments to apply to
rational numbers.  Consider the following theorem of Winograd.

Theorem  (Winograd [6]).  If $A$ is a $p \times q$ matrix whose
entries form a set of algebraically independent real numbers
and if $v$ is a $q$-vector input, then any rational algorithm
which computes the $p$ elements of $A \cdot v$ requires at least
$\lceil \frac{1}{2}pq \rceil$ multiplication/divisions.

Using techniques of Theorem 3.9, this can be extended
to the following.

Theorem  If addition has positive cost, then for almost all
$p \times q$ matrices $A$ with rational entries, any least cost
rational algorithm which computes $A \cdot v$ requires at least
$\lceil \frac{1}{2}pq \rceil$ multiplication/divisions.